Testing for Equal Distributions in High Dimension
Abstract
We propose a new nonparametric test for equality of two or more multivariate distributions based on Euclidean distance between sample elements. Several consistent tests for comparing multivariate distributions can be developed from the underlying theoretical results. The test procedure for the multisample problem is developed and applied for testing the composite hypothesis of equal distributions, when distributions are unspecified. The proposed test is universally consistent against all fixed alternatives (not necessarily continuous) with finite second moments. The test is implemented by conditioning on the pooled sample to obtain an approximate permutation test, which is distribution free. Our Monte Carlo power study suggests that the new test may be much more sensitive than tests based on nearest neighbors against several classes of alternatives, and performs particularly well in high dimension. Computational complexity of our test procedure is independent of dimension and number of populations sampled. The test is applied in a high dimensional problem, testing microarray data from cancer samples.
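As a rough illustration of the idea in the abstract, the two-sample case can be sketched as below. This is a minimal sketch, not the authors' implementation: it assumes the statistic takes the common energy-distance form (twice the mean between-sample Euclidean distance, minus the two mean within-sample distances, scaled by n1*n2/(n1+n2)), and the function names `pairwise_dist`, `energy_statistic`, and `permutation_test` are hypothetical. The p-value is approximated by reshuffling the pooled sample, which is the conditioning step the abstract describes.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def energy_statistic(x, y):
    """Assumed two-sample statistic built only from Euclidean distances."""
    n1, n2 = len(x), len(y)
    between = pairwise_dist(x, y).mean()
    within_x = pairwise_dist(x, x).mean()  # includes zero self-distances
    within_y = pairwise_dist(y, y).mean()
    return n1 * n2 / (n1 + n2) * (2.0 * between - within_x - within_y)


def permutation_test(x, y, n_perm=199, seed=0):
    """Approximate permutation p-value, conditioning on the pooled sample."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n1 = len(x)
    observed = energy_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))  # random relabelling of the pool
        stat = energy_statistic(pooled[idx[:n1]], pooled[idx[n1:]])
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(30, 50))           # 30 points in dimension 50
    y = rng.normal(loc=0.5, size=(30, 50))  # location-shifted alternative
    print(permutation_test(x, y))
```

This sketch recomputes all distances for each permutation for readability; computing the pooled distance matrix once and re-indexing it would avoid the repeated work.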
Related papers
Comparing the Shape Parameters of Two Weibull Distributions Using Records: A Generalized Inference
The Weibull distribution is a widely applicable model for lifetime data. For inference about two Weibull distributions using records, the shape parameters of the distributions are usually assumed to be equal. However, the literature offers no appropriate method for comparing the shape parameters. Therefore, comparing the shape parameters of two Weibull distributions is very important. I...
On the Number of Modes of Finite Mixtures of Elliptical Distributions
We extend the concept of the ridgeline from Ray and Lindsay (2005) to finite mixtures of general elliptical densities with possibly distinct density generators in each component. This can be used to obtain bounds for the number of modes of two-component mixtures of t distributions in any dimension. In case of proportional dispersion matrices, these have at most three modes, while for equal degr...
Testing a Point Null Hypothesis against One-Sided for Non Regular and Exponential Families: The Reconcilability Condition to P-values and Posterior Probability
In this paper, the reconcilability between the P-value and the posterior probability in testing a point null hypothesis against a one-sided hypothesis is considered. Two essential families of distributions, the non-regular and the exponential, are studied. It is shown that, for a non-regular family of distributions, it is in some cases possible to find a prior distribution function under which P-va...
The distance correlation t-test of independence in high dimension
AMS subject classifications: primary 62G10; secondary 62H20. Keywords: dCor, dCov, multivariate independence, distance covariance, distance correlation, high dimension. Abstract: Distance correlation is extended to the problem of testing the independence of random vectors in high dimension. Distance correlation characterizes independence and determines a test of multivariate independence for rand...
Indistinguishability of Absolutely Continuous and Singular Distributions
It is shown that there are no consistent decision rules for the hypothesis testing problem of distinguishing between absolutely continuous and purely singular probability distributions on the real line. In fact, there are no consistent decision rules for distinguishing between absolutely continuous distributions and distributions supported by Borel sets of Hausdorff dimension 0. It follows that...